
    Strong association between pseudogenization mechanisms and gene sequence length

    Pseudogenes arise from the decay of gene copies following either RNA-mediated duplication (processed pseudogenes) or DNA-mediated duplication (nonprocessed pseudogenes). Here, we show that long protein-coding genes tend to produce more nonprocessed pseudogenes than short genes, whereas the opposite is true for processed pseudogenes. Protein-coding genes longer than 3000 bp are 6 times more likely to produce nonprocessed pseudogenes than processed ones. Reviewers: This article was reviewed by Dr. Dan Graur and Dr. Craig Nelson (nominated by Dr. J Peter Gogarten).

    Assessing the genomic evidence for conserved transcribed pseudogenes under selection

    Background: Transcribed pseudogenes are copies of protein-coding genes that have accumulated indicators of coding-sequence decay (such as frameshifts and premature stop codons), but nonetheless remain transcribed. Recent experimental evidence indicates that transcribed pseudogenes may regulate the expression of homologous genes, through antisense interference, or generation of small interfering RNAs (siRNAs). Here, we assessed the genomic evidence for such transcribed pseudogenes of potential functional importance, in the human genome. The most obvious indicators of such functional importance are significant evidence of conservation and selection pressure. Results: A variety of pseudogene annotations from multiple sources were pooled and filtered to obtain a subset of sequences that have significant mid-sequence disablements (frameshifts and premature stop codons), and that have clear evidence of full-length mRNA transcription. We found 1750 such transcribed pseudogene annotations (TPAs) in the human genome (corresponding to ~11.5% of human pseudogene annotations). We checked for syntenic conservation of TPAs in other mammals (rhesus monkey, mouse, rat, dog and cow). About half of the human TPAs are conserved in rhesus monkey, but strikingly, very few in mouse (~3%). The TPAs conserved in rhesus monkey show evidence of selection pressure (relative to surrounding intergenic DNA) on: (i) their GC content, and (ii) their rate of nucleotide substitution. This is in spite of distributions of Ka/Ks (ratios of non-synonymous to synonymous substitution rates), congruent with a lack of protein-coding ability. Furthermore, we have identified 68 human TPAs that are syntenically conserved in at least two other mammals. Interestingly, we observe three TPA sequences conserved in dog that have intermediate character (i.e., evidence of both protein-coding ability and pseudogenicity), and discuss the implications of this. Conclusion: Through evolutionary analysis, we have identified candidate sequences for functional human transcribed pseudogenes, and have pinpointed 68 strong candidates for further investigation as potentially functional transcribed pseudogenes across multiple mammal species.

    Mining Mammalian Transcript Data for Functional Long Non-Coding RNAs

    BACKGROUND: The role of long non-coding RNAs (lncRNAs) in controlling gene expression has garnered increased interest in recent years. Sequencing projects, such as Fantom3 for mouse and H-InvDB for human, have generated abundant data on transcribed components of mammalian cells, the majority of which appear not to be protein-coding. However, much of the non-protein-coding transcriptome could merely be a consequence of 'transcription noise'. It is therefore essential to use bioinformatic approaches to identify the likely functional candidates in a high-throughput manner. PRINCIPAL FINDINGS: We derived a scheme for classifying and annotating likely functional lncRNAs in mammals. Using the available experimental full-length cDNA data sets for human and mouse, we identified 78 lncRNAs that are either syntenically conserved between human and mouse, or that originate from the same protein-coding genes. Of these, 11 have significant sequence homology. We found that these lncRNAs exhibit: (i) patterns of codon substitution typical of non-coding transcripts; (ii) preservation of sequences in distant mammals such as dog and cow; (iii) significant sequence conservation relative to their corresponding flanking regions (in 50% of cases, flanking regions do not have homology at all; and in the remainder, the degree of conservation is significantly less); (iv) existence mostly as single-exon forms (8/11); and (v) presence of conserved and stable secondary structure motifs within them. We further identified orthologous protein-coding genes that are contributing to the pool of lncRNAs; of these, genes implicated in carcinogenesis are significantly over-represented. CONCLUSION: Our comparative mammalian genomics approach coupled with evolutionary analysis identified a small population of conserved long non-protein-coding RNAs (lncRNAs) that are potentially functional across Mammalia. Additionally, our analysis indicates that amongst the orthologous protein-coding genes that produce lncRNAs, those implicated in cancer pathogenesis are significantly over-represented, suggesting that these lncRNAs could play an important role in cancer pathomechanisms.

    Uracil content of 16S rRNA of thermophilic and psychrophilic prokaryotes correlates inversely with their optimal growth temperatures

    We report here the finding of a highly significant inverse correlation of the uracil content of 16S rRNA and the optimum growth temperature (T(opt)) of cultured thermophilic and psychrophilic prokaryotes. This correlation was significantly different from the weaker correlations between the contents of other nucleotides and T(opt). Analysis of the 16S rRNA secondary structure regions revealed a fall in the A:U base-pair content in step with the increase in T(opt) that was much steeper than that of mismatched base-pairs, which are thermodynamically less stable. These findings indicate that the 16S rRNA sequences of thermophiles and psychrophiles are under a strong thermo-adaptive pressure, and that structure–function constraints play a crucial role in determining their 16S rRNA nucleotide composition. The derived relationship between uracil content and T(opt) was used to develop an algorithm to predict the T(opt) values of uncultured prokaryotes lacking cultured close relatives and belonging to the phyla predominantly containing thermophiles. This algorithm may be useful in guiding the design of cultivation conditions for hitherto uncultured microbes.
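
    The kind of analysis this abstract describes can be illustrated with a minimal sketch. It assumes a plain least-squares line relating uracil fraction to T(opt); the abstract does not state the form of the authors' algorithm, so this is not their method. The function names (uracil_fraction, pearson_r, fit_line, predict_topt) are hypothetical, and no real data are included.

```python
from math import sqrt


def uracil_fraction(seq: str) -> float:
    """Fraction of U among the A/C/G/U bases of a sequence.

    T is treated as U so that DNA-encoded 16S rRNA gene sequences also work.
    Ambiguity codes and gap characters are ignored.
    """
    bases = [b for b in seq.upper().replace("T", "U") if b in "ACGU"]
    if not bases:
        raise ValueError("sequence contains no A/C/G/U bases")
    return bases.count("U") / len(bases)


def _sums(xs, ys):
    """Means and centred sums of squares/products shared by the helpers below."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return mx, my, sxx, syy, sxy


def pearson_r(xs, ys) -> float:
    """Pearson correlation coefficient between two samples."""
    _, _, sxx, syy, sxy = _sums(xs, ys)
    return sxy / sqrt(sxx * syy)


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    mx, my, sxx, _, sxy = _sums(xs, ys)
    slope = sxy / sxx
    return slope, my - slope * mx


def predict_topt(seq: str, slope: float, intercept: float) -> float:
    """Predict T(opt) from a sequence, ASSUMING a linear relationship."""
    return slope * uracil_fraction(seq) + intercept
```

    In use, one would compute uracil_fraction for each cultured organism's 16S rRNA sequence, fit the line against measured T(opt) values, and then apply predict_topt to sequences from uncultured organisms. An inverse correlation, as reported above, would appear as a negative pearson_r and a negative slope.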

    Lack of functional redundancy in the relationship between microbial diversity and ecosystem functioning.

    1. Biodiversity is declining worldwide with detrimental effects on ecosystems. However, we lack a quantitative understanding of the shape of the relationship between microbial biodiversity and ecosystem function (BEF). This limits our understanding of how microbial diversity depletion can impact key functions for human well-being, including pollutant detoxification. 2. Three independent microcosm experiments were conducted to evaluate the direction (i.e. positive, negative or null) and the shape of the relationships between bacterial diversity and both broad (i.e. microbial respiration) and specialized (i.e. toxin degradation) functions in five Australian and two UK freshwater ecosystems using next-generation sequencing platforms. 3. Reduced bacterial diversity, even after accounting for biomass, caused a decrease in broad (i.e. cumulative microbial respiration) and specialized (biodegradation of two important toxins) functions in all cases. Unlike the positive but decelerating BEF relationship observed most frequently in plants and animals, most evaluated functional measurements were related to bacterial diversity in a non-redundant fashion (e.g. exponentially and/or linearly). 4. Synthesis. Our results suggest that there is a lack of functional redundancy in the relationship between bacterial diversity and ecosystem functioning; thus the consequences of declining microbial diversity on ecosystem functioning and human welfare have likely been considerably underestimated.

    Methylome-Wide Association Study of Schizophrenia: Identifying Blood Biomarker Signatures of Environmental Insults

    Epigenetic studies present unique opportunities to advance schizophrenia research because they can potentially account for many of its clinical features and suggest novel strategies to improve disease management.

    The missing link: Bordetella petrii is endowed with both the metabolic versatility of environmental bacteria and virulence traits of pathogenic Bordetellae

    Gross R, Guzman CA, Sebaihia M, et al. The missing link: Bordetella petrii is endowed with both the metabolic versatility of environmental bacteria and virulence traits of pathogenic Bordetellae. BMC Genomics. 2008;9(1):449. Background: Bordetella petrii is the only environmental species hitherto found among the otherwise host-restricted and pathogenic members of the genus Bordetella. Phylogenetically, it connects the pathogenic Bordetellae and environmental bacteria of the genera Achromobacter and Alcaligenes, which are opportunistic pathogens. B. petrii strains have been isolated from very different environmental niches, including river sediment, polluted soil, marine sponges and a grass root. Recently, clinical isolates associated with bone degenerative disease or cystic fibrosis have also been described. Results: In this manuscript we present the results of the analysis of the completely annotated genome sequence of the B. petrii strain DSMZ12804. B. petrii has a mosaic genome of 5,287,950 bp harboring numerous mobile genetic elements, including seven large genomic islands. Four of them are highly related to the clc element of Pseudomonas knackmussii B13, which encodes genes involved in the degradation of aromatics. Though being an environmental isolate, the sequenced B. petrii strain also encodes proteins related to virulence factors of the pathogenic Bordetellae, including the filamentous hemagglutinin, which is a major colonization factor of B. pertussis, and the master virulence regulator BvgAS. However, it lacks all known toxins of the pathogenic Bordetellae. Conclusion: The genomic analysis suggests that B. petrii represents an evolutionary link between free-living environmental bacteria and the host-restricted obligate pathogenic Bordetellae. Its remarkable metabolic versatility may enable B. petrii to thrive in very different ecological niches.